
Genome Biology and Evolution

Oxford University Press (OUP)

All preprints, ranked by how well they match Genome Biology and Evolution's content profile, based on 280 papers previously published here. The average preprint has a 0.08% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.

1
ORFanes in mitochondrial genomes of marine polychaete Polydora

Selifanova, M.; Demianchenko, O.; Noskova, E.; Pitikov, E.; Skvortsov, D.; Drozd, J.; Vatolkina, N.; Apel, P.; Kolodyazhnaya, E.; Ezhova, M. A.; Tzetlin, A. B.; Neretina, T. V.; Knorre, D. A.

2023-02-04 evolutionary biology 10.1101/2023.02.04.527105 medRxiv
Top 0.1%
54.3%

Most characterised metazoan mitochondrial genomes are compact and encode a small set of proteins that are essential for oxidative phosphorylation. However, in rare cases, invertebrate taxa have additional open reading frames (ORFs) in their mtDNA sequences. Here, we sequenced and analysed the mitochondrial genome of a polychaete worm, Polydora cf. ciliata, part of whose life cycle takes place in low-oxygen conditions. In the mitogenome, we found three "ORFane" regions (1063, 427, and 519 bp) that have no resemblance to any standard metazoan mtDNA gene but lack stop codons in one of the reading frames. Similar regions are found in the mitochondrial genomes of three other Polydora species and Bocardiella hamata. All five species share the same gene order in their mitogenomes, which differs from that of other known Spionidae mitogenomes. By analysing the ORFane sequences, we found that they are under negative selection pressure, contain conservative regions, and harbour predicted transmembrane domains. The codon adaptation indices (CAIs) of the ORFane genes were in the same range of values as the CAI of conventional protein-coding genes in corresponding mitochondrial genomes. Together, this suggests that ORFanes encode functional proteins. We speculate that the ORFanes originated from the conventional mitochondrial protein-coding genes which were duplicated when the Polydora/Bocardiella species complex separated from the rest of the Spionidae.

Significance statement: Metazoan mitochondrial genomes usually contain a conservative set of genes and features. However, mitogenomes of some species contain ORFanes - putative protein-coding genes without clear homology with other known sequences. In this study, we analysed three ORFanes in mitochondria of species of the genera Polydora and Bocardiella, which were absent in all other representatives of Spionidae. To the best of our knowledge, ORFanes have not been described in Annelida before. Sequence analysis of the ORFanes suggests they contain conservative regions and are likely translated into functional proteins. Our study features an uncommon case where new protein-coding genes emerged in the mitochondrial genomes of Metazoa.

2
Genomic analysis of laboratory-evolved, heat-adapted Escherichia coli strains

McGuire, B. E.; Nano, F. E.

2024-10-03 evolutionary biology 10.1101/2024.10.01.616104 medRxiv
Top 0.1%
40.5%

Adaptive laboratory evolution to high incubation temperatures represents a complex evolutionary problem, and each study to date performed in Escherichia coli has resulted in a different set of mutations. We performed adaptive laboratory evolution of E. coli to heat by passaging a culture at elevated temperatures for 150 days. Throughout the adaptive evolution we expressed a set of genes that induce hyper-mutagenesis. These growth conditions yielded a strain with a maximum growth temperature approximately 2 °C above that of the parental strain. We preserved evolved isolates weekly and obtained and analyzed whole-genome sequencing data for three isolates from different time points. We identified hundreds of mutations, including mutations in components of the RNA polymerase (RpoB, RpoC and RpoD), Rho, and the heat shock proteins GroES, GroEL, DnaK, ClpB, IbpA and HslU. We compared the proteomes of the starting strain and final strain grown at 37 °C and 42.5 °C and identified changes in abundance between samples for GroESL, HslVU, DnaK, ClpB and other important proteins. This study details a distinct evolutionary route towards enhanced thermotolerance, contributes to our understanding of adaptation to heat in Escherichia coli and may provide insights into heat adaptation in other organisms.

3
De novo ORFs are more likely to shrink than to elongate during neutral evolution.

Lebherz, M. K.; Iyengar, B. R.; Bornberg-Bauer, E.

2024-02-12 evolutionary biology 10.1101/2024.02.12.579890 medRxiv
Top 0.1%
39.1%

For protein coding genes to emerge de novo from non-genic DNA, the DNA sequence must gain an open reading frame (ORF) and the ability to be transcribed. The newborn de novo gene can further evolve to accumulate changes in its sequence. Consequently, it can also elongate or shrink with time. Existing literature shows that older de novo genes have longer ORFs, but it is not clear if they elongated with time or remained of the same length since their inception. To address this question we developed a mathematical model of ORF elongation as a Markov-jump process, and show that ORFs tend to keep their length in short evolutionary timescales. We also show that if change occurs it is likely to be a truncation. Our genomics and transcriptomics data analyses of seven Drosophila melanogaster populations are also in agreement with the model's prediction. We conclude that selection could facilitate ORF length extension that may explain why longer ORFs were observed in old de novo genes in studies analysing longer evolutionary time scales.

Significance: New protein coding genes can emerge from non-genic DNA through a process called de novo gene emergence. Genes thus emerged usually have a small open reading frame (ORF). However, studies show that de novo genes with an older evolutionary origin have longer ORFs than younger genes. To understand how ORF length evolves, we use a combination of mathematical modeling and population level genome data analysis. We find that in the absence of evolutionary selection, ORFs tend to become shorter rather than longer. Therefore, long ORFs are probably selected by evolution to be retained in the genome.

4
Transposable element dynamics are consistent across the Drosophila phylogeny, despite drastically differing content

Hill, T.

2019-07-26 genomics 10.1101/651059 medRxiv
Top 0.1%
38.2%

Background: The evolutionary dynamics of transposable elements (TEs) vary across the tree of life and even between closely related species with similar ecologies. In Drosophila, most work on TE dynamics has focused on Drosophila melanogaster, and the overall pattern indicates that TEs show an excess of low frequency insertions, consistent with their frequent turnover and high fitness cost in the genome. Outside of D. melanogaster, insertions in the species Drosophila algonquin suggest that this situation may not be universal, even within Drosophila. Here we test whether the pattern observed in D. melanogaster is similar across five Drosophila species that shared a common ancestor more than fifty million years ago.

Results: For the most part, TE family and order insertion frequency patterns are broadly conserved between species, supporting the idea that TEs have invaded species recently, are mostly costly, and that their dynamics are conserved in orthologous regions of the host genome.

Conclusions: Most TEs retain similar activities and fitness costs across the Drosophila phylogeny, suggesting little evidence of drift in the dynamics of TEs across the phylogeny, and that most TEs have invaded species recently.

5
Gene loss and acquisition in lineages of bacteria evolving in a human host environment

Gabrielaite, M.; Johansen, H. K.; Molin, S.; Nielsen, F. C.; Marvig, R. L.

2020-02-03 evolutionary biology 10.1101/2020.02.03.931667 medRxiv
Top 0.1%
37.3%

While genome analyses have documented that there are differences in the gene repertoire between evolutionarily distant lineages of the same bacterial species, less is known about micro-evolutionary dynamics of gene loss and acquisition within lineages of bacteria as they evolve over the timescale of years. This knowledge is valuable to understand both the basic mutational steps that on long timescales lead to evolutionarily distant bacterial lineages, and the evolution of the individual lineages themselves. In the case that lineages evolve in a human host environment, gene loss and acquisition may furthermore have implications for disease. We analyzed the genomes of 45 Pseudomonas aeruginosa lineages evolving in the lungs of cystic fibrosis patients to identify genes that are lost or acquired during the first years of infection in each of the different lineages. On average, the lineage genome content changed by 88 genes (range 0-473). Genes were more often lost than acquired, and prophage genes were more variable than bacterial genes. We identified genes that were lost or acquired independently across different clonal lineages, i.e. convergent molecular evolution. Convergent evolution suggests that there is a selection for loss and acquisition of certain genes in the host environment. We find that a significant proportion of such genes are associated with virulence; a trait previously shown to be important for adaptation. Furthermore, we also compared the genomes across lineages to show that within-lineage variable genes more often belonged to genomic content not shared across all lineages. Finally, we used 4,760 genes shared by 446 P. aeruginosa genomes to develop a stable and discriminatory typing scheme for P. aeruginosa clone types (Pactyper, https://github.com/MigleSur/Pactyper).
In sum, our analysis adds to the knowledge on the pace and drivers of gene loss and acquisition in bacteria evolving over multiple years in a human host environment and provides a basis to further understand how gene loss and acquisition play a role in lineage differentiation and host adaptation.

Data Summary: P. aeruginosa genome sequencing data has been made publicly available by Marvig et al. (2015) and is deposited in the Sequence Read Archive (SRA) under accession ERP004853.

6
Genome-Wide Search for Candidate Drivers of Adaptation Reveals Genes Enriched for Shifts in Purifying Selection (SPurS)

Popejoy, A. B.; Domanska, D.; Thomas, J. H.

2020-01-13 evolutionary biology 10.1101/2020.01.11.902759 medRxiv
Top 0.1%
36.8%

An open question in comparative evolutionary genomics is whether or not certain loci are the primary drivers of divergence between taxonomic lineages or species groups. Alternatively, genetic drivers of species divergence may be evenly distributed across the genome. The increasing availability of genome sequences from diverse taxa has enabled the development of novel methods to address this question. Genomes of many highly diverged species may now be compared in order to tease apart genetic differences that drive adaptive or functional divergence, and genetic differences that are observed by chance and are not causally linked to traits that differ between species or lineages. In order to test the hypothesis that a particular subset of loci or genes is responsible for driving adaptive changes between mammals and non-mammals, we developed a novel comparative approach to identify sites that are highly conserved within lineages or species groups and diverge between them. Loci with a high concentration of these sites may be called Shifts in Purifying Selection (SPurS) because a change has occurred between two groups of species at some point in the past, and the shift is conserved (via purifying selection) over a long period of time. Evaluating 7484 orthologous gene copies from 76 vertebrate species, we developed an empirical distribution of SPurS across the genome between Synapsida (placental and non-placental mammals) and Sauropsida (birds, crocodilians, squamates, and turtles), and compared this distribution to the expected null distribution of SPurS using matched simulated data. We then identified a subset of genes that is enriched for SPurS, relative to the full set of genes and to their matched simulated alignments. These SPurS-enriched genes are thus likely candidate drivers of functional divergence or adaptation between the mammalian and non-mammalian species groups in our analysis. 
Investigators seeking to identify genetic drivers of inter-species evolution may find this method useful, and we provide a web-based software interface to facilitate its use.

7
New estimates of genome size in Orthoptera and their evolutionary implications

Hawlitschek, O.; Sadilek, D.; Lara-Sophie, D.; Buchholz, K.; Noori, S.; Baez, I. L.; Wehrt, T.; Brozio, J.; Travnicek, P.; Seidel, M.; Husemann, M.

2022-09-22 genomics 10.1101/2022.09.21.508865 medRxiv
Top 0.1%
33.8%

Animal genomes vary widely in size, and much of their architecture and content remains poorly understood. Even among related groups, such as orders of insects, genomes may vary in size by orders of magnitude - for reasons unknown. The largest known insect genomes were repeatedly found in Orthoptera, e.g., Podisma pedestris (1C = 16.93 pg), Stethophyma grossum (1C = 18.48 pg) and Bryodemella holdereri (1C = 18.64 pg). While all these species belong to the suborder Caelifera, the ensiferan Deracantha onos (1C = 19.60 pg) was recently found to have the largest genome. Here, we present new genome size estimates of 50 further species of Ensifera (superfamilies Gryllidea, Tettigoniidea) and Caelifera (Acrididae, Tetrigidae) based on flow cytometric measurements. We found that Bryodemella tuberculata (Caelifera: Acrididae) has the largest insect genome measured so far, with 1C = 21.96 pg (21.48 Gbp). Species with 2n = 16 and 2n = 22 chromosomes have significantly larger genomes than species with other chromosome counts. Gryllidea genomes vary between 1C = 0.95 and 2.88 pg, and Tetrigidae between 1C = 2.18 and 2.41 pg, while the genomes of all other studied Orthoptera range in size from 1C = 1.37 to 21.96 pg. Reconstructing ancestral genome sizes based on a phylogenetic tree of mitochondrial genomic data, we found genome size values of >15.84 pg only for the nodes of Bryodemella holdereri / B. tuberculata and Chrysochraon dispar / Euthystira brachyptera. The predicted values of ancestral genome sizes are 6.19 pg for Orthoptera, 5.37 pg for Ensifera, and 7.28 pg for Caelifera. The reasons for the large genomes in Orthoptera remain largely unknown, but a duplication seems unlikely as chromosome numbers do not differ. Sequence-based genomic studies may shed light on the underlying evolutionary mechanisms.

8
Two decades of suspect evidence for adaptive DNA-sequence evolution - Less negative selection misconstrued as positive selection

Chen, Q.; He, Z.; Feng, X.; Yang, H.; Shi, S.; Wu, C.-I.

2020-04-23 evolutionary biology 10.1101/2020.04.21.049973 medRxiv
Top 0.1%
33.2%

Evidence for biological adaptation is often obtained by studying DNA sequence evolution. Since the analyses are affected by both positive and negative selection, studies usually assume constant negative selection in the time span of interest. For this reason, hundreds of studies that conclude adaptive evolution might have reported false signals caused by relaxed negative selection. We test this suspicion in two ways. First, we analyze the fluctuation in population size, N, during evolution. For example, the evolutionary rate in the primate phylogeny could vary by as much as 2000-fold due to the variation in N alone. Second, we measure the variation in negative selection directly by analyzing the polymorphism data from four taxa (Drosophila, Arabidopsis, primates, and birds, with 64 species in total). The strength of negative selection, as measured by the ratio of nonsynonymous/synonymous polymorphisms, fluctuates strongly and at multiple time scales. The two approaches suggest that the variation in the strength of negative selection may be responsible for the bulk of the reported adaptive genome evolution in the last two decades. This study corroborates the recent report [1] on the inconsistent patterns of adaptive genome evolution. Finally, we discuss the path forward in detecting adaptive sequence evolution.

9
Sequence type diversity amongst antibiotic-resistant bacterial strains is lower than amongst antibiotic-susceptible strains

Pradhananga, A.; Benitez Rivera, L.; Clark, C.; Tisthammer, K.; Pennings, P. S.

2022-11-24 evolutionary biology 10.1101/2022.11.23.517742 medRxiv
Top 0.1%
33.0%

The increasing number of antibiotic resistant bacterial infections is a global threat to human health. Antibiotic resistant bacterial strains generally evolve from susceptible strains by either horizontal gene transfer or chromosomal mutations. After evolving within a host, such resistant strains can be transmitted to other hosts and increase in frequency in the population at large. Population genetic theory postulates that the increase in frequency of an adaptive trait can lead to signatures of selective sweeps. One would thus expect to observe reduced genetic diversity amongst that part of the population that carries the adaptive trait. Specifically, if the evolution of new resistant strains is rare, it is expected that resistant strains represent only a subset of the diversity of susceptible strains. It is currently unknown if diversity of resistant strains is indeed lower than diversity of susceptible strains when considering antibiotic resistance. Here we show that in several bacterial species in several different datasets, sequence-type diversity amongst antibiotic-resistant bacterial strains is indeed lower than amongst antibiotic-susceptible strains in most cases. We re-analysed eight existing clinical datasets with Escherichia coli, Staphylococcus aureus and Enterococcus faecium samples. These datasets consisted of 53 - 1094 patient samples, with multi-locus sequence types and antibiotic resistance phenotypes for 3 - 19 different antibiotics. Out of 59 comparisons, we found that resistant strains were significantly less diverse than susceptible strains in 51 cases (86%). In addition, we show that sequence-type diversity of antibiotic-resistant strains is lower if resistance is rare, compared to when resistance is common, which is consistent with rare resistance being due to fewer evolutionary origins. 
Our results show that for several different bacterial species, we observe reduced diversity of resistant strains, which is consistent with the evolution of resistance driven by selective sweeps stemming from a limited number of evolutionary origins. In future studies, more detailed analysis of such sweep signatures is warranted.

10
Two decades of suspect evidence for adaptive DNA-sequence evolution - Failure in consistent detection of positive selection

He, Z.; Chen, Q.; Yang, H.; Chen, Q.; Shi, S.; Wu, C.-I.

2020-04-21 evolutionary biology 10.1101/417717 medRxiv
Top 0.1%
32.9%

A recent study suggests that the evidence of adaptive DNA sequence evolution accumulated in the last 20 years may be suspect [1]. The suspicion thus calls for a re-examination of the reported evidence. The two main lines of evidence are from the McDonald-Kreitman (MK) test, which compares divergence and polymorphism data, and the PAML test, which analyzes multi-species divergence data. Here, we apply these two tests concurrently on the genomic data of Drosophila and Arabidopsis. To our surprise, the >100 genes identified by the two tests do not overlap beyond random expectations. The results could mean i) high false positives by either test or ii) high false negatives by both tests due to low power. To rule out the latter, we merge every 20 - 30 genes into a "supergene". At the supergene level, the power of detection is high, with 8% - 56% yielding adaptive signals. Nevertheless, the calls still do not overlap. Since it is unlikely that one test is largely correct and the other is mostly wrong (see Discussion), the total evidence of adaptive DNA sequence evolution should be deemed unreliable. As suggested by Chen et al. [1], the reported evidence for positive selection may in fact be signals of fluctuating negative selection, which are handled differently by the two tests. Possible paths forward on this central evolutionary issue are discussed.

11
Assessing positive selection in centromere-associated kinetochore proteins across Metazoan groups.

Healey, H. M.; Gomez, L. E.; Sheikh, S. I.; Camel, B. R.; Forbes, A. A.; Sterner, K. N.; Beck, E. A.

2026-02-18 evolutionary biology 10.64898/2026.02.13.705784 medRxiv
Top 0.1%
32.7%

Centromeres are comprised of long stretches of repetitive DNA that evolve rapidly in organisms across the tree of life. Consistent selfish centromere evolution can also have cascading effects - driving rapid evolution in interacting kinetochore proteins - possibly to maintain centromere-kinetochore compatibility. Effects of selfishly evolving centromeres on interacting proteins are most heavily studied in the inner kinetochore and assembly proteins, including the constitutive centromere-associated network proteins CENP-A and CENP-C, with some exploration of the extended effects on other kinetochore-associated protein complexes. While rapid evolution of the centromere has been broadly studied in many organisms, studies assessing positive selection in centromere-associated kinetochore proteins have largely focused on Drosophila. Here, we tested the hypothesis that signatures of positive selection would be present in outer kinetochore and condensin genes in diverse animal groups. We selected two protein complexes - the Condensin I complex and the Mis12 complex - to test for positive selection in parasitic wasps, two groups of ray-finned fishes (including the Amazon molly, an asexual diploid exempt from centromere drive), and two groups of primates. We did not find selection using any test in any protein in the Amazon molly but did find sporadic positive selection in proteins in both complexes across all groups.

12
The first chromosome-scale Dugesia genomes shed light on structural rearrangements and genome size evolution in flatworms

Dols-Serrate, D.; Tenaguillo-Arriola, I.; Pisarenco, V. A.; Olive-Muniz, M.; Rozas, J.; Riutort, M.

2025-12-11 genomics 10.64898/2025.12.11.693146 medRxiv
Top 0.1%
32.4%

High-quality, chromosome-scale genomes are crucial for understanding biological processes, yet many metazoan lineages, including most Lophotrochozoa, remain underrepresented in genome databases. Among these, planarians (Platyhelminthes, Tricladida), particularly Dugesia, are a globally distributed and phenotypically diverse group that has become an important model in evolutionary biology, notably for investigating the genetic effects of agametic asexuality. However, the lack of chromosome-scale assemblies has limited progress. Here, we present the first chromosome-scale genomes of four Western Mediterranean Dugesia species, displaying the first intra- and intergeneric comparisons. Comparison with the regeneration model organism Schmidtea mediterranea rejects a whole-genome duplication as the cause of differences in chromosomal number and genome size between genera. Instead, Dugesia shows extensive lineage-specific and differential expansions of DNA transposable elements, likely contributing to genome size variation during diversification. Despite differences in the dynamics of structural genome rearrangements observed between genera, both groups lack the conservation of ancestral metazoan linkage groups, supporting the idea that genome structural instability is a key feature of flatworm genome evolution. Our newly generated genomic resources and findings offer vital insights into the genetic basis of diversification and establish Dugesia as a valuable model for studying metazoan genome dynamics, including the evolution of alternative reproductive systems.

13
Surprising amount of stasis in repetitive genome content across the Brassicales

Beric, A.; Mabry, M. E.; Harkess, A. E.; Schranz, M. E.; Conant, G. C.; Edger, P. P.; Meyers, B. C.; Pires, J. C.

2020-06-15 evolutionary biology 10.1101/2020.06.15.153296 medRxiv
Top 0.1%
32.2%

Genome size of plants has long piqued the interest of researchers due to the vast differences among organisms. However, the mechanisms that drive size differences have yet to be fully understood. Two important contributing factors to genome size are expansions of repetitive elements, such as transposable elements (TEs), and whole-genome duplications (WGD). Although studies have found correlations between genome size and both TE abundance and polyploidy, these studies typically test for these patterns within a genus or species. The plant order Brassicales provides an excellent system to test if genome size evolution patterns are consistent across larger time scales, as there are numerous WGDs. This order is also home to one of the smallest plant genomes, Arabidopsis thaliana - chosen as the model plant system for this reason - as well as to species with very large genomes. With new methods that allow for TE characterization from low-coverage genome shotgun data and 71 taxa across the Brassicales, we find no correlation between genome size and TE content, and more surprisingly we identify no significant changes to TE landscape following WGD.

14
Evolutionary trajectories of secondary replicons in multipartite genomes

Dranenko, N. O.; Rodina, A. D.; Demenchuk, Y. V.; Gelfand, M. S.; Bochkareva, O. O.

2023-04-09 genomics 10.1101/2023.04.09.536151 medRxiv
Top 0.1%
32.0%

Most bacterial genomes have a single chromosome that may be supplemented by smaller, dispensable plasmids. However, approximately 10% of bacteria with completely sequenced genomes, mostly pathogens and plant symbionts, have more than one stable large replicon. Some secondary replicons are species-specific, carrying pathogenicity or symbiotic factors. Other replicons are common on at least the genus level, carry house-keeping genes, and may have a size of several million base pairs. We analyzed the abundance and sizes of large secondary replicons in different groups of bacteria and identified two patterns in the evolution of multipartite genomes. In nine genera of four families, Pseudoalteromonadaceae, Burkholderiaceae, Vibrionaceae, and Brucellaceae, we observed a positive correlation between the sizes of the chromosome and the secondary replicon with the slope in the range of 0.6-1.2. This indicates that in these genera the replicons evolve in a coordinated manner, with comparable rates of gene gain/loss, hence supporting classification of such secondary replicons as chromids. The second, more common pattern features gene gains and losses mainly occurring in the primary replicon, yielding a stable size of the secondary replicon. Such secondary replicons are usually present in only a low fraction of the genus species. Hence, such replicons behave as megaplasmids. A mixed situation was observed in symbiotic genera from the Rhizobiaceae family where the large secondary replicons are of stable size, but present in all species. These results may provide a general framework for understanding the evolution of genome complexity in prokaryotes.

Significance: Large secondary replicons are observed in representatives of many taxonomic groups of bacteria. Traditionally, they are referred to as second chromosomes, chromids, or megaplasmids, with little consistency, in particular because their evolution remains understudied. Here we demonstrate that the sizes of secondary replicons follow two main evolutionary trends: replicons whose size scales linearly with the size of the main chromosome (the suggested term chromids) typically contain numerous essential genes (rRNA, tRNA, ribosomal proteins), while large secondary replicons of stable size (termed megaplasmids) contain few or no such genes.

15
Evolution Under Thermal Stress Affects Escherichia coli's Resistance to Antibiotics

Bullivant, A.; Lozano-Huntelman, N.; Tabibian, K.; Leung, V.; Armstrong, D.; Dudley, H.; Savage, V. M.; Rodriguez-Verdugo, A.; Yeh, P.

2024-03-03 evolutionary biology 10.1101/2024.02.27.582334 medRxiv
Top 0.1%
31.9%

Exposure to both antibiotics and temperature changes can induce similar physiological responses in bacteria. Thus, changes in growth temperature may affect antibiotic resistance. Previous studies have found that evolution under antibiotic stress causes shifts in the optimal growth temperature of bacteria. However, little is known about how evolution under thermal stress affects antibiotic resistance. We examined more than 100 strains of Escherichia coli that evolved under thermal stress. We asked whether evolution under thermal stress affects optimal growth temperature, whether there are any correlations between evolving in high temperatures and antibiotic resistance, and whether antibiotic efficacy against these strains changes depending on the local environment's temperature. We found that: (1) surprisingly, most of the heat-evolved strains displayed a decrease in optimal growth temperature and overall growth relative to the ancestor strain, (2) there were complex patterns of changes in antibiotic resistance when comparing the heat-evolved strains to the ancestor strain, and (3) there were few significant correlations among changes in antibiotic resistance, optimal growth temperature, and overall growth.

Importance: Escherichia coli, a bacterial species often found within the intestinal tract of warm-blooded organisms, can be harmful to humans. Like all species of bacteria, E. coli can evolve, particularly in the presence of stressful conditions such as extreme temperatures or antibiotic treatments. Recent evidence suggests that when encountering one source of stress, an organism's ability to deal with a different source of stress is also affected. With global climate change and the continued evolution of antibiotic-resistant bacteria, the need to further investigate how temperature and antibiotics interact is clear. The significance of our research is in identifying possible correlations between temperature and antibiotic stress, broadening our understanding of how stressors affect organisms, and allowing for insights into possible future evolutionary pathways.

16
A chromosome-level genome for the nudibranch gastropod Berghia stephanieae helps parse clade-specific gene expression in novel and conserved phenotypes

Goodheart, J. A.; Rio, R. A.; Taraporevala, N. F.; Fiorenza, R. A.; Barnes, S. R.; Morrill, K.; Jacob, M. A. C.; Whitesel, C.; Masterson, P.; Batzel, G. O.; Johnston, H. T.; Ramirez, M. D.; Katz, P. S.; Lyons, D.

2023-11-16 genomics 10.1101/2023.08.04.552006 medRxiv
Top 0.1%
28.2%

How novel phenotypes originate from conserved genes, processes, and tissues remains a major question in biology. Research that sets out to answer this question often focuses on the conserved genes and processes involved, an approach that explicitly excludes the impact of genetic elements that may be classified as clade-specific, even though many of these genes are known to be important for many novel, or clade-restricted, phenotypes. This is especially true for understudied phyla such as mollusks, where limited genomic and functional biology resources have long hindered assessments of genetic homology and function. To address this gap, we constructed a chromosome-level genome for the gastropod Berghia stephanieae (Valdes, 2005) to investigate the expression of clade-specific genes across both novel and conserved tissue types in this species. The final assembled and filtered Berghia genome is comparable to other high quality mollusk genomes in terms of size (1.05 Gb) and number of predicted genes (24,960 genes), and is highly contiguous. The proportion of upregulated, clade-specific genes varied across tissues, but with no clear trend between the proportion of clade-specific genes and the novelty of the tissue. However, more complex tissue like the brain had the highest total number of upregulated, clade-specific genes, though the ratio of upregulated clade-specific genes to the total number of upregulated genes was low. Our results, when combined with previous research on the impact of novel genes on phenotypic evolution, highlight the fact that the complexity of the novel tissue or behavior, the type of novelty, and the developmental timing of evolutionary modifications will all influence how novel and conserved genes interact to generate diversity.

17
Rapid evolution and comparative analysis of piRNA clusters in D. simulans

Narayanan, P.; Srivastav, S.; Signor, S.

2026-01-20 evolutionary biology 10.64898/2026.01.19.700409 medRxiv
Top 0.1%
28.0%

Eukaryotic genomes are ubiquitously occupied by mobile genetic elements termed transposons, which are silenced via a specialized class of small RNA called piRNA. The small RNA is produced from the transposons themselves when they occupy specialized regions of the genome termed piRNA clusters. The formation of these specialized regions, or their evolution over time, is not well understood. Recent work has suggested that they are extremely variable even within a single species such as Drosophila melanogaster. We were interested in taking a comparative approach to piRNA cluster evolution to ask the question - what processes are unique to D. melanogaster and which are shared? Shared phenomena are more likely to be fundamental aspects of piRNA formation and evolution compared to those that are more labile. Using five high-quality long-read genome assemblies and five genotype-specific piRNA libraries, we approach this question from a population genetics standpoint. We annotate piRNA clusters, transposons, and structural variants in each of these five genomes. We found extensive variation in piRNA clusters across strains, with smaller piRNA clusters more likely to be limited to a single genotype. By and large, our results are consistent with a model of piRNA cluster evolution in which piRNA clusters are rapidly formed and lost, with a small subset increasing in frequency and length over time. However, we find that the TEs which nucleate the formation of small piRNA clusters are entirely distinct in D. simulans compared to D. melanogaster, and likely reflect its invasion history rather than any inherent property of the transposon to nucleate clusters. Therefore, while large common clusters can act as traps as has been posited for piRNA clusters, there are also numerous small clusters that are born and lost rapidly within a species.

18
The slow evolving genome of the xenacoelomorph worm Xenoturbella bocki

Schiffer, P. H.; Natsidis, P.; Leite, D. J.; Robertson, H.; Lapraz, F.; Marletaz, F.; Fromm, B.; Baudry, L.; Simpson, F.; Hoye, E.; Zakrzewski, A.-C.; Kapli, P.; Hoff, K. J.; Mueller, S.; Marbouty, M.; Marlow, H.; Copley, R. H.; Sarkies, P.; Telford, M. J.

2022-06-27 evolutionary biology 10.1101/2022.06.24.497508 medRxiv
Top 0.1%
27.5%

The evolutionary origins of Bilateria remain enigmatic. One of the more enduring proposals highlights similarities between a cnidarian-like planula larva and simple acoel-like flatworms. This idea is based in part on the view of the Xenacoelomorpha as an outgroup to all other bilaterians which are themselves designated the Nephrozoa (protostomes and deuterostomes). Genome data can help to elucidate phylogenetic relationships and provide important comparative data. Here we assemble and analyse the genome of the simple, marine xenacoelomorph Xenoturbella bocki, a key species for our understanding of early bilaterian and deuterostome evolution. Our highly contiguous genome assembly of X. bocki has a size of ~111 Mbp in 18 chromosome-like scaffolds, with repeat content and intron, exon and intergenic space comparable to other bilaterian invertebrates. We find X. bocki to have a similar number of genes to other bilaterians and to have retained ancestral metazoan synteny. Key bilaterian signalling pathways are also largely complete and most bilaterian miRNAs are present. We conclude that X. bocki has a complex genome typical of bilaterians, in contrast to the apparent simplicity of its body plan. Overall, our data do not provide evidence supporting the idea that Xenacoelomorpha are a primitively simple outgroup to other bilaterians and gene presence/absence data support a relationship with Ambulacraria.

19
Lagging strand encoding promotes adaptive evolution

Merrikh, C.; Harris, L.; Mangiameli, S.; Merrikh, H.

2020-09-11 evolutionary biology 10.1101/2020.06.23.167650 medRxiv
Top 0.1%
27.3%

Cells may be able to promote adaptive evolution in a gene-specific and temporally controlled manner. Genes encoded on the lagging strand have a higher mutation rate and evolve faster than genes on the leading strand. This effect is likely driven by head-on replication-transcription conflicts, which occur when lagging strand genes are transcribed during DNA replication. We previously suggested that the ability to selectively increase mutagenesis in a subset of genes may provide an adaptive advantage for cells. However, it is also possible that this effect could be neutral or even highly deleterious. Distinguishing between these models is important because, if the adaptive model is correct, it would indicate that 1) head-on conflicts, which are generally deleterious, can also provide a benefit to cells, and 2) cells possess the remarkable ability to fine-tune adaptive evolution. Furthermore, investigating these models may address the long-standing debate regarding whether accelerated evolution through conflicts can be adaptive. To distinguish between the adaptive and neutral models, we conducted single nucleotide polymorphism (SNP) analyses on wild strains of bacteria from divergent phyla. To test the adaptive hypothesis, we analyzed convergent mutation patterns. As a simple test of the neutral hypothesis, we performed in silico modeling. Our results show that convergent mutations are enriched in lagging strand genes and that these mutations are unlikely to have arisen by chance. Additionally, we observe that convergent mutation frequency has a stronger positive correlation with gene length in lagging strand genes. This effect strongly suggests that head-on conflicts between the DNA replication and transcription machineries are a key mechanism driving the formation of convergent mutations. Together, our data indicate that head-on replication-transcription conflicts can promote adaptive evolution in a variety of bacterial species, and potentially other organisms.

20
Evolutionary persistence of a highly prevalent multicopy mitochondrial-derived nuclear insertion (Mega-NUMT) in Neotropical Drosophila flies

Montoliu-Nerin, M.; Strunov, A.; Heyworth, E.; Schneider, D. I.; Thoma, J.; Hua-Van, A.; Courret, C.; Klasson, L. J.; Miller, W. J.

2026-04-01 evolutionary biology 10.64898/2026.03.31.715258 medRxiv
Top 0.1%
26.4%

Background: Although strict maternal transmission of mitochondria is a general feature of animals and humans for ensuring homogeneity in mitochondrial DNA (mtDNA) across generations, exceptions were reported in the recent past. For example, some extremely rare but spectacular cases of heteroplasmy and paternal transmission in humans have questioned the universal evolutionary principle. Hence, as an alternative, the Mega-NUMT concept was coined to explain this discovery and was thereafter partly proven to exist. This concept expands on the quite common transfer of mtDNA fragments to the nucleus (NUMTs) by considering the existence of multicopy mitochondrial nuclear insertions. Mega-NUMT reports are currently restricted to a few cases in animals, including humans. However, even in humans, their detailed genomic organization, natural prevalence, and potential biological functions remain unclear. Methodology/Principal Findings: Here, we discovered that up to 60 full-sized mitochondrial genomes are integrated into the nuclear genome of the neotropical fruit fly Drosophila paulistorum using long-read sequencing and confirmed their presence by in situ hybridization. The copies are organized in one cluster on chromosome 3, which we, due to its similarity with the Mega-NUMT concept, designated the "Dpau Mega-NUMT". Contrary to the rarity in humans, this Mega-NUMT is found at high prevalence (40%) in both long-term laboratory lines and natural D. paulistorum populations of different semispecies. Additionally, the mitochondrial copies in the Mega-NUMT cluster are phylogenetically separated from the current mitotypes of D. paulistorum. Together, these observations suggest long-term maintenance of the Mega-NUMT in nature. Hence, we propose that the Dpau Mega-NUMT may have been transferred to the nuclear genome before D. paulistorum semispecies radiation and maintained at relatively high prevalence in nature by balancing selection due to yet undetermined functions.
Conclusions/Significance: To our knowledge, this is the first verified existence and detailed dissection of a Mega-NUMT outside cats and humans. We show that Mega-NUMTs can be persistent in nature, even at high prevalence, potentially due to balancing selection. Our findings strengthen the importance of high-quality long-read sequencing technologies for deciphering complex repeat-rich genomic regions to deepen our understanding of the dynamics of genome evolution within genomic "dark matter".